🤖 reinforcement learning
Learning Real-World Acrobatic Flight from Human Preferences
arxiv.org·17h
🧩 operations research
Model Merging – A Biased Overview
crisostomi.github.io·4h
Discuss: Hacker News
📊 linear programming
On Zero-Shot Reinforcement Learning
arxiv.org·2d
🧩 operations research
Linear Dynamics meets Linear MDPs: Closed-Form Optimal Policies via Reinforcement Learning
arxiv.org·1d
📊 linear programming
LLM-Driven Intrinsic Motivation for Sparse Reward Reinforcement Learning
arxiv.org·17h
📊 linear programming
Adaptive Harmonic Mitigation in Distributed Power Converters via Reinforcement Learning
dev.to·1h
Discuss: DEV
🧩 operations research
I reverse-engineered a bug in my PPO agent that gave it a 9x performance boost
theprincipledagent.com·1d
Discuss: Hacker News
🦀 Rust
StepWiser: Stepwise Generative Judges for Wiser Reasoning
arxiv.org·17h
🧩 operations research
[P] AI Learns to play Sonic 2 Emerald Hill (Deep Reinforcement...
youtube.com·2d
🧩 operations research
History Rhymes: Accelerating LLM Reinforcement Learning with RhymeRL
arxiv.org·17h
🧩 operations research
HAEPO: History-Aggregated Exploratory Policy Optimization
arxiv.org·17h
🧩 operations research
Reinforcement Learning-based Control via Y-wise Affine Neural Networks (YANNs)
arxiv.org·2d
🧩 operations research
MUA-RL: Multi-turn User-interacting Agent Reinforcement Learning for agentic tool use
arxiv.org·17h
🧩 operations research
RLMR: Reinforcement Learning with Mixed Rewards for Creative Writing
arxiv.org·17h
📊 linear programming
Scalable Fairness Shaping with LLM-Guided Multi-Agent Reinforcement Learning for Peer-to-Peer Electricity Markets
arxiv.org·17h
📊 linear programming
Learning Interior Point Method for AC and DC Optimal Power Flow
arxiv.org·17h
📊 linear programming
Deep learning reveals antibiotics in the archaeal proteome
nature.com·6h
Discuss: Hacker News
🦀 Rust
Collaborative-Online-Learning-Enabled Distributionally Robust Motion Control for Multi-Robot Systems
arxiv.org·1d
🧩 operations research
Introduction to Artificial Neural Networks – Part 1 (2013)
theprojectspot.com·17h
Discuss: Hacker News
📊 linear programming
A 20-Year-Old Algorithm Can Help Us Understand Transformer Embeddings
ai.stanford.edu·2h
Discuss: Hacker News
📊 linear programming